Retrieval Models versus Retrievability

نویسندگان

  • Shariq Bashir
  • Andreas Rauber
  • Bashir
  • Rauber
چکیده

Retrievability is an important measure in information retrieval that can be used to analyze retrieval models and document collections. Rather than just focusing on a set of few documents that are given in the form of relevance judgments, retrievability examines what is retrieved, how frequently it is retrieved, and how much effort is needed to retrieve it. Such a measure is of particular interest within the recall oriented retrieval systems (e.g. patent or legal retrieval), because in this context a document needs to be retrieved before it can be judged for relevance. If a retrieval model makes some patents hard to find, patent searchers could miss relevant documents just because of the bias of the retrieval model. In this chapter we explain the concept of retrievability in information retrieval. We also explain how it can be estimated and how it can be used for analysing a retrieval bias of retrieval models. We also show how retrievability relates to effectiveness by analysing the relationship between retrievability and effectiveness measures and how the retrievability measure can be used to improve effectiveness. Shariq Bashir Department of Computer Science Mohammad Ali Jinnah University, Islamabad e-mail: [email protected] Andreas Rauber Institute of Software Technology and Interactive Systems TU Wien e-mail: [email protected]

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

On the relationship between query characteristics and IR functions retrieval bias

Bias quantification of retrieval functions with the help of document retrievability scores has recently evolved as an important evaluation measure for recall-oriented retrieval applications.While numerous studies have evaluated retrieval bias of retrieval functions, solid validation of its impact on realistic types of queries is still limited. This is due to the lack of well-accepted criteria f...

متن کامل

Efficiently Estimating Retrievability Bias

Retrievability is the measure of how easily a document can be retrieved using a particular retrieval system. The extent to which a retrieval system favours certain documents over others (as expressed by their retrievability scores) determines the level of bias the system imposes on a collection. Recently it has been shown that it is possible to tune a retrieval system by minimising the retrieva...

متن کامل

Improving Retrievability and Recall by Automatic Corpus Partitioning

With increasing volumes of data, much effort has been devoted to finding the most suitable answer to an information need. However, in many domains, the question whether any specific information item can be found at all via a reasonable set of queries is essential. This concept of Retrievability of information has evolved into an important evaluation measure of IR systems in recall-oriented appl...

متن کامل

An Improved Retrievability-Based Cluster-Resampling Approach for Pseudo Relevance Feedback

Cluster-based pseudo-relevance feedback (PRF) is an effective approach for searching relevant documents for relevance feedback. Standard approach constructs clusters for PRF only on the basis of high similarity between retrieved documents. The standard approach works quite well if the retrieval bias of the retrieval model does not create any effect on the retrievability of documents. In our exp...

متن کامل

Analyzing Document Retrievability in Patent Retrieval Settings

Most information retrieval settings, such as web search, are typically precision-oriented, i.e. they focus on retrieving a small number of highly relevant documents. However, in specific domains, such as patent retrieval or law, recall becomes more relevant than precision: in these cases the goal is to find all relevant documents, requiring algorithms to be tuned more towards recall at the cost...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016